Convex relaxation for the planted clique, biclique, and clustering problems
Abstract
A clique of a graph G is a set of pairwise adjacent nodes of G. Similarly, a biclique (U, V) of a bipartite graph G is a pair of disjoint, independent vertex sets such that each node in U is adjacent to every node in V in G. We consider the problems of identifying the maximum clique of a graph, known as the maximum clique problem, and identifying the biclique (U, V) of a bipartite graph that maximizes the product |U| · |V|, known as the maximum edge biclique problem. We show that finding a clique or biclique of a given size in a graph is equivalent to finding a rank one matrix satisfying a particular set of linear constraints. These problems can be formulated as rank minimization problems and relaxed to convex programming by replacing rank with its convex envelope, the nuclear norm. Both problems are NP-hard yet we show that our relaxation is exact in the case that the input graph contains a large clique or biclique plus additional nodes and edges. For each problem, we provide two analyses of when our relaxation is exact. In the first, the diversionary edges are added deterministically by an adversary. In the second, each potential edge is added to the graph independently at random with fixed probability p. In the random case, our bounds match the earlier bounds of Alon, Krivelevich, and Sudakov, as well as Feige and Krauthgamer for the maximum clique problem.

We extend these results and techniques to the k-disjoint-clique problem. The maximum node k-disjoint-clique problem is to find a set of k disjoint cliques of a given input graph containing the maximum number of nodes. Given input graph G and nonnegative edge weights w ∈ R+, the maximum mean weight k-disjoint-clique problem seeks to identify the set of k disjoint cliques of G that maximizes the sum of the average weights of the edges, with respect to w, of the complete subgraphs of G induced by the cliques. These problems may be considered as a way to pose the clustering problem.
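The rank-one characterization of cliques stated earlier in this abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes the linear constraints take the form "X[i, j] = 0 whenever i ≠ j and (i, j) is not an edge", and the helper names `clique_matrix` and `respects_non_edges` are hypothetical. It builds X = v vᵀ from the 0/1 indicator vector v of a node set and checks that X is rank one, that its nuclear norm equals the clique size, and that the non-edge constraints hold only when the node set really is a clique.

```python
import numpy as np

def clique_matrix(n, nodes):
    """Return X = v v^T, where v is the 0/1 indicator vector of `nodes`."""
    v = np.zeros(n)
    v[list(nodes)] = 1.0
    return np.outer(v, v)

def respects_non_edges(X, adjacency):
    """True if X[i, j] == 0 for every pair i != j that is not an edge.

    This constraint form is an assumption made for illustration.
    """
    n = adjacency.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    non_edges = off_diagonal & (adjacency == 0)
    return bool(np.all(X[non_edges] == 0))

# Toy graph on 6 nodes: a planted clique {0, 1, 2, 3} plus two extra edges.
n = 6
clique = [0, 1, 2, 3]
A = np.zeros((n, n), dtype=int)
for i in clique:
    for j in clique:
        if i != j:
            A[i, j] = 1
for i, j in [(3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1

X = clique_matrix(n, clique)
rank = np.linalg.matrix_rank(X)              # rank of the indicator outer product
nuclear_norm = np.linalg.norm(X, ord="nuc")  # sum of singular values
total = X.sum()                              # equals (clique size) squared

# A node set that is not a clique violates the non-edge constraints.
not_a_clique = clique_matrix(n, [2, 3, 4])

print(rank, nuclear_norm, total)
print(respects_non_edges(X, A), respects_non_edges(not_a_clique, A))
```

For the planted clique of size 4, the matrix has rank 1, nuclear norm 4, and entries summing to 16, while the set {2, 3, 4} fails the check because nodes 2 and 4 are not adjacent. Solving the relaxed problem itself would need a convex solver, which this sketch deliberately leaves out.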
In clustering, one wants to partition a given data set so that the data items in each partition or cluster are similar and the items in different clusters are dissimilar. For the graph G such that the set of nodes represents a given data set and any two nodes are adjacent if and only if the corresponding items are similar, clustering the data into k disjoint clusters is equivalent to partitioning G into k-disjoint cliques. Similarly, given a complete graph with nodes corresponding to a given data set and edge weights indicating similarity between each pair of items, the data may be clustered by solving the maximum mean weight k-disjoint-clique problem.

We show that both instances of the k-disjoint-clique problem can be formulated as rank constrained optimization problems and relaxed to semidefinite programs using the nuclear norm relaxation of rank. We also show that when the input instance corresponds to a collection of k disjoint planted cliques plus additional edges and nodes, this semidefinite relaxation is exact for both problems. We provide theoretical bounds that guarantee ex-
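As a rough illustration of how k disjoint cliques can be encoded by a single rank-k positive semidefinite matrix, the sketch below uses the normalization X = Σᵢ vᵢ vᵢᵀ / |Cᵢ|, where vᵢ is the indicator vector of the i-th clique Cᵢ. This normalization and the helper name `disjoint_clique_matrix` are assumptions made for the example; the abstract does not state the exact formulation. Under this choice the trace of X equals k, each row sum is at most 1, and the sum of all entries equals the number of clustered nodes.

```python
import numpy as np

def disjoint_clique_matrix(n, cliques):
    """Return X = sum_i v_i v_i^T / |C_i| over the given disjoint node sets.

    The 1/|C_i| scaling is an illustrative assumption, not quoted from the paper.
    """
    X = np.zeros((n, n))
    for nodes in cliques:
        v = np.zeros(n)
        v[list(nodes)] = 1.0
        X += np.outer(v, v) / len(nodes)
    return X

# Seven data items: two clusters of sizes 3 and 2, with items 5 and 6 unclustered.
n = 7
cliques = [[0, 1, 2], [3, 4]]
X = disjoint_clique_matrix(n, cliques)

rank = np.linalg.matrix_rank(X)        # one rank-one term per clique
trace = np.trace(X)                    # each clique contributes exactly 1
covered = X.sum()                      # each clique contributes its size
min_eig = np.linalg.eigvalsh(X).min()  # nonnegative for a PSD matrix
row_sums = X.sum(axis=1)               # 1 for clustered nodes, 0 otherwise

print(rank, trace, covered)
print(row_sums)
```

Here the rank and trace both equal k = 2 and the entries sum to 5, the number of clustered items. These are the kinds of quantities a semidefinite relaxation can constrain or maximize once the rank requirement is dropped.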
Similar resources
Nuclear norm minimization for the planted clique and biclique problems
We consider the problems of finding a maximum clique in a graph and finding a maximum-edge biclique in a bipartite graph. Both problems are NP-hard. We write both problems as matrix-rank minimization and then relax them using the nuclear norm. This technique, which may be regarded as a generalization of compressive sensing, has recently been shown to be an effective way to solve rank optimizati...
Statistical Limits of Convex Relaxations
Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solve these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations. Particularly, we consider two problems: Mean estimation for sparse principal submatrix and edge p...
On the Statistical Limits of Convex Relaxations: A Case Study
Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solve these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations. Particularly, we consider two problems: Mean estimation for sparse principal submatrix and edge p...
Detecting Bicliques in GF[q]
We consider the problem of finding planted bicliques in random matrices over GF[q]. That is, our input matrix is a GF[q]-sum of an unknown biclique (rank-1 matrix) and a random matrix. We study different models for the random graphs and characterize the conditions when the planted biclique can be recovered. We also empirically show that a simple heuristic can reliably recover the planted bicl...
Robust convex relaxation for the planted clique and densest k-subgraph problems
We consider the problem of identifying the densest k-node subgraph in a given graph. We write this problem as an instance of rank-constrained cardinality minimization and then relax using the nuclear and ℓ1 norms. Although the original combinatorial problem is NP-hard, we show that the densest k-subgraph can be recovered from the solution of our convex relaxation for certain program inputs. In ...
Publication date: 2011